The Transition from Centralized Machine Learning to Federated Learning for Mental Health in Education: A Survey of Current Methods and Future Directions

Ebrahimi, Maryam, Sahay, Rajeev, Hosseinalipour, Seyyedali, Akram, Bita

arXiv.org Artificial Intelligence

Research has increasingly explored the application of artificial intelligence (AI) and machine learning (ML) within the mental health domain to enhance both patient care and healthcare provider efficiency. Given that mental health challenges frequently emerge during early adolescence -- the critical years of high school and college -- investigating AI/ML-driven mental health solutions within the education domain is of paramount importance. Nevertheless, conventional AI/ML techniques follow a centralized model training architecture, which poses privacy risks due to the need for transferring students' sensitive data from institutions, universities, and clinics to central servers. Federated learning (FL) has emerged as a solution to address these risks by enabling distributed model training while maintaining data privacy. Despite its potential, research on applying FL to analyze students' mental health remains limited. In this paper, we aim to address this limitation by proposing a roadmap for integrating FL into mental health data analysis within educational settings. We begin by providing an overview of mental health issues among students and reviewing existing studies where ML has been applied to address these challenges. Next, we examine broader applications of FL in the mental health domain to emphasize the lack of focus on educational contexts. Finally, we propose promising research directions focused on using FL to address mental health issues in the education sector, including a discussion of the synergies between the proposed directions and broader human-centered domains. By categorizing the proposed research directions into short- and long-term strategies and highlighting the unique challenges at each stage, we aim to encourage the development of privacy-conscious AI/ML-driven mental health solutions.
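The distributed training pattern that FL relies on, local updates on private data followed by server-side aggregation, can be sketched in a minimal FedAvg-style example. This is an illustrative assumption, not the survey's method: the logistic-regression client model and all function names here are hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local step: logistic-regression gradient descent on its
    private data. Raw records never leave the client; only weights do."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (preds - y) / len(y)
    return w

def federated_average(client_weights, client_sizes):
    """Server-side FedAvg: average client models weighted by dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two simulated institutions with private data; the server only sees weights.
rng = np.random.default_rng(0)
global_w = np.zeros(3)
clients = [(rng.normal(size=(20, 3)), rng.integers(0, 2, 20)) for _ in range(2)]
for _ in range(10):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])
```

The privacy property rests on the fact that only the weight vectors cross the network; each institution's student records stay local.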


Multi-objective Combinatorial Methodology for Nuclear Reactor Site Assessment: A Case Study for the United States

Erdem, Omer, Daley, Kevin, Hoelzle, Gabrielle, Radaideh, Majdi I.

arXiv.org Artificial Intelligence

As the global demand for clean energy intensifies to achieve sustainability and net-zero carbon emission goals, nuclear energy stands out as a reliable solution. However, fully harnessing its potential requires overcoming key challenges, such as the high capital costs associated with nuclear power plants (NPPs). One promising strategy to mitigate these costs involves repurposing sites with existing infrastructure, including coal power plant (CPP) locations, which offer pre-built facilities and utilities. Additionally, brownfield sites - previously developed or underutilized lands often impacted by industrial activity - present another compelling alternative. These sites typically feature valuable infrastructure that can significantly reduce the costs of NPP development. This study introduces a novel multi-objective optimization methodology, leveraging combinatorial search to evaluate over 30,000 potential NPP sites in the United States. Our approach addresses gaps in the current practice of assigning pre-determined weights to each site attribute, a practice that can bias the ranking. Each site is assigned a performance-based score, derived from a detailed combinatorial analysis of its site attributes. The methodology generates a comprehensive database comprising site locations (inputs), attributes (outputs), site scores (outputs), and the contribution of each attribute to the site score (outputs). We then use this database to train a machine learning neural network model, enabling rapid predictions of nuclear siting suitability across any location in the contiguous United States.


LegalTurk Optimized BERT for Multi-Label Text Classification and NER

Zeidi, Farnaz, Amasyali, Mehmet Fatih, Erol, Çiğdem

arXiv.org Artificial Intelligence

The introduction of the Transformer neural network, along with techniques like self-supervised pre-training and transfer learning, has paved the way for advanced models like BERT. Despite BERT's impressive performance, opportunities for further enhancement exist. To our knowledge, most efforts have focused on improving BERT's performance in English and in general domains, with no study specifically addressing the legal Turkish domain. Our study is primarily dedicated to enhancing the BERT model within the legal Turkish domain through modifications in the pre-training phase. In this work, we introduce our innovative modified pre-training approach by combining diverse masking strategies. In the fine-tuning task, we focus on two essential downstream tasks in the legal domain: named entity recognition and multi-label text classification. To evaluate our modified pre-training approach, we fine-tuned all customized models alongside the original BERT models to compare their performance. Our modified approach demonstrated significant improvements in both NER and multi-label text classification tasks compared to the original BERT model. Finally, to showcase the impact of our proposed models, we trained our best models with different corpus sizes and compared them with BERTurk models. The experimental results demonstrate that our innovative approach, despite being pre-trained on a smaller corpus, competes with BERTurk.
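As a hedged illustration (not the paper's actual pre-training code), combining masking strategies can be as simple as switching between independent single-token masking and contiguous-span masking when corrupting input sequences; the function and names below are mine:

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mlm_prob=0.15, strategy="token", rng=None):
    """Toy mixed-masking illustration: 'token' masks individual positions
    independently; 'span' masks one contiguous run of positions."""
    rng = rng or random.Random(0)
    out = list(tokens)
    if strategy == "token":
        for i in range(len(out)):
            if rng.random() < mlm_prob:
                out[i] = MASK
    elif strategy == "span":
        n_mask = max(1, round(len(out) * mlm_prob))
        start = rng.randrange(len(out) - n_mask + 1)
        for i in range(start, start + n_mask):
            out[i] = MASK
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return out

# A pre-training pipeline could draw the strategy at random per sequence.
sentence = "mahkeme karari temyiz edildi ve dosya yargitaya gonderildi".split()
masked = mask_tokens(sentence, mlm_prob=0.25, strategy="span")
```

Alternating strategies like this exposes the model to both local (token-level) and longer-range (span-level) reconstruction targets during pre-training.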


Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking

Acikgoz, Emre Can, Erdogan, Mete, Yuret, Deniz

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are becoming crucial across various fields, emphasizing the urgency for high-quality models in underrepresented languages. This study explores the unique challenges faced by low-resource languages, such as data scarcity, model selection, evaluation, and computational limitations, with a special focus on Turkish. We conduct an in-depth analysis to evaluate the impact of training strategies, model choices, and data availability on the performance of LLMs designed for underrepresented languages. Our approach includes two methodologies: (i) adapting existing LLMs originally pretrained in English to understand Turkish, and (ii) developing a model from the ground up using Turkish pretraining data, both supplemented with supervised fine-tuning on a novel Turkish instruction-tuning dataset aimed at enhancing reasoning capabilities. The relative performance of these methods is evaluated through the creation of a new leaderboard for Turkish LLMs, featuring benchmarks that assess different reasoning and knowledge skills. Furthermore, we conduct experiments on data and model scaling, both during pretraining and fine-tuning, simultaneously emphasizing the capacity for knowledge transfer across languages and addressing the challenges of catastrophic forgetting encountered during fine-tuning in a different language. Our goal is to offer a detailed guide for advancing the LLM framework in low-resource linguistic contexts, thereby making natural language processing (NLP) benefits more globally accessible.


Context-dependent Explainability and Contestability for Trustworthy Medical Artificial Intelligence: Misclassification Identification of Morbidity Recognition Models in Preterm Infants

Guzey, Isil, Ucar, Ozlem, Ciftdemir, Nukhet Aladag, Acunas, Betul

arXiv.org Artificial Intelligence

Although machine learning (ML) models of AI achieve high performance in medicine, they are not free of errors. Empowering clinicians to identify incorrect model recommendations is crucial for engendering trust in medical AI. Explainable AI (XAI) aims to address this requirement by clarifying AI reasoning to support the end users. Several studies on biomedical imaging have achieved promising results recently. Nevertheless, solutions for models using tabular data do not yet meet the requirements of clinicians. This paper proposes a methodology to support clinicians in identifying failures of ML models trained with tabular data. We built our methodology on three main pillars: decomposing the feature set by leveraging clinical context latent space, assessing the clinical association of global explanations, and Latent Space Similarity (LSS) based local explanations. We demonstrated our methodology on ML-based recognition of preterm infant morbidities caused by infection. The risk of mortality, lifelong disability, and antibiotic resistance due to model failures was an open research question in this domain. With our approach, we successfully identified misclassification cases of two models. By contextualizing local explanations, our solution provides clinicians with actionable insights to support their autonomy for informed final decisions.


Structure Aware Negative Sampling in Knowledge Graphs

Ahrabian, Kian, Feizi, Aarash, Salehi, Yasmin, Hamilton, William L., Bose, Avishek Joey

arXiv.org Machine Learning

Learning low-dimensional representations for entities and relations in knowledge graphs using contrastive estimation represents a scalable and effective method for inferring connectivity patterns. A crucial aspect of contrastive learning approaches is the choice of corruption distribution that generates hard negative samples, which force the embedding model to learn discriminative representations and find critical characteristics of observed data. While earlier methods either employ overly simple corruption distributions (e.g., uniform), yielding easy, uninformative negatives, or use sophisticated adversarial distributions with challenging optimization schemes, they do not explicitly incorporate known graph structure, resulting in suboptimal negatives. In this paper, we propose Structure Aware Negative Sampling (SANS), an inexpensive negative sampling strategy that utilizes the rich graph structure by selecting negative samples from a node's k-hop neighborhood. Empirically, we demonstrate that SANS finds semantically meaningful negatives and is competitive with SOTA approaches while requiring no additional parameters or difficult adversarial optimization.
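The core idea, drawing negatives from a node's k-hop neighborhood rather than uniformly, can be sketched as follows. This is a simplified adjacency-matrix illustration; the function names and the fallback behavior are my assumptions, not the paper's implementation:

```python
import numpy as np

def k_hop_reachability(adj: np.ndarray, k: int) -> np.ndarray:
    """Boolean matrix: reach[i, j] is True if j is within k hops of i
    (excluding i itself)."""
    reach = np.zeros_like(adj, dtype=bool)
    power = np.eye(adj.shape[0], dtype=int)
    for _ in range(k):
        power = np.minimum(power @ adj, 1)  # clip entries to 0/1
        reach |= power.astype(bool)
    np.fill_diagonal(reach, False)
    return reach

def sample_negative_tail(head, true_tails, reach, rng):
    """SANS-style corruption: replace the tail with an entity from the head's
    k-hop neighborhood that does not complete an observed triple."""
    candidates = np.setdiff1d(np.flatnonzero(reach[head]), list(true_tails))
    if candidates.size == 0:  # fall back to uniform corruption
        candidates = np.setdiff1d(np.arange(len(reach)), list(true_tails))
    return int(rng.choice(candidates))

# Path graph 0-1-2-3: within 2 hops of node 0 lie nodes 1 and 2.
adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
reach = k_hop_reachability(adj, k=2)
neg = sample_negative_tail(0, true_tails={1}, reach=reach,
                           rng=np.random.default_rng(0))
```

Because k-hop neighbors tend to be semantically related to the head entity, such corrupted triples are harder for the embedding model to separate than uniform negatives.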


SARS-CoV-2 virus RNA sequence classification and geographical analysis with convolutional neural networks approach

Yazar, Selcuk

arXiv.org Artificial Intelligence

Covid-19 infection, which spread worldwide in December 2019 and is still active, has caused more than 250 thousand deaths to date. Research on this subject has focused on analyzing the genetic structure of the virus, developing vaccines, the course of the disease, and its source. In this study, RNA sequences belonging to the SARS-CoV-2 virus are transformed into gene motifs with two basic image processing algorithms and classified with convolutional neural network (CNN) models. The CNN models achieved an average Area Under Curve (AUC) value of 98% on RNA sequences classified as Asia, Europe, America, and Oceania. The resulting artificial neural network model was used for phylogenetic analysis of the variant of the virus isolated in Turkey. The classification results were compared with gene alignment values in the GISAID database, where SARS-CoV-2 virus records from all over the world are kept. Our experimental results reveal that detecting the geographic distribution of the virus with CNN models may serve as an efficient method. Keywords: Deep Learning, Bioinformatics, Convolutional neural network, SARS-CoV-2, Pattern Classification. Introduction: Artificial intelligence practices, and particularly deep learning studies, are widely used in many research fields, including medicine and bioinformatics. CNN models are very successful at lesion and disease diagnosis, especially in medical imaging. Beyond image processing and natural language processing, deep learning is also widely applied to time-series data with approaches such as Long Short-Term Memory.
In deep learning practices, low-level features of data such as DNA sequences, pathology images, and tomography scans can be learned directly from the data, largely eliminating the need for manual feature engineering.
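The abstract does not detail the paper's two image-processing algorithms, but a common way to turn an RNA sequence into a CNN-ready array is a one-hot "image" encoding, sketched here purely as an illustrative assumption:

```python
import numpy as np

RNA_BASES = "ACGU"  # one channel per nucleotide

def sequence_to_image(seq: str, length: int = 64) -> np.ndarray:
    """Encode an RNA sequence as a 4 x length one-hot array, truncating or
    padding (with all-zero columns) to a fixed width a CNN can consume."""
    seq = seq.upper()[:length].ljust(length, "N")
    img = np.zeros((4, length), dtype=np.float32)
    for i, base in enumerate(seq):
        channel = RNA_BASES.find(base)
        if channel >= 0:  # unknown bases such as "N" stay all-zero
            img[channel, i] = 1.0
    return img

img = sequence_to_image("AUGGCCAUU", length=16)
```

Fixed-width encodings like this let standard 2D convolution layers slide over sequence positions as if they were image columns.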


UK to invest £2.6M in drone and satellite tech to deliver vital supplies

Daily Mail - Science & tech

The UK government is setting aside £2.6 million for new satellite and drone technology that could deliver essential supplies during the coronavirus lockdown. The UK Space Agency (UKSA) is funding new solutions to deliver equipment such as test kits, masks, gowns and goggles for frontline NHS staff. The joint initiative with the European Space Agency could lead to vital equipment soaring through British skies via drones to support the NHS in tackling COVID-19. Companies can submit their proposals, including ideas for deployment and a pilot phase, on the European Space Agency (ESA) website. The UK's space industry is also looking for ways to combat the spread of coronavirus and prevent future epidemics using satellites.